Toward Metric Indexes for Incremental Insertion and Querying

نویسندگان

  • Edward Raff
  • Charles Nicholas
چکیده

In this work we explore the use of metric index structures, which accelerate nearest neighbor queries, in the scenario where we need to interleave insertions and queries during deployment. This use-case is inspired by a real-life need in malware analysis triage, and is surprisingly understudied. Existing literature tends to either focus on only final query efficiency, often does not support incremental insertion, or does not support arbitrary distance metrics. We modify and improve three algorithms to support our scenario of incremental insertion and querying with arbitrary metrics, and evaluate them on multiple datasets and distance metrics while varying the value of k for the desired number of nearest neighbors. In doing so we determine that our improved Vantage-Point tree of Minimum-Variance performs best for this scenario.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Effect of aerobic exercise and fish oil supplements on plasma levels of inflammatory indexes in mice

 Background: Exercise has positive and negative effects on immune system. Herein, we would like to investigate the effects of incremental aerobic training and fish oil supplementation on the plasma levels of CRP, CPK and IL-17 in trained mice. One of the major roles of immune system is to produce soluble or cellular components that provide the immunity against inflammatory agent. The purpo...

متن کامل

Restructuring versus Non Restructuring Insertions in MDF Indexes

MDF tree is a data structure (index) that is used to speed up similarity searches in huge databases. To achieve its goal the indexes should exploit some property of the dissimilarity measure. MDF indexes assume that the dissimilarity measure can be viewed as a distance in a metric space. Moreover, in this framework is assumed that the distance is computationally very expensive and then, countin...

متن کامل

Indexing the Trajectories of Moving Objects in Networks

The management of moving objects has been intensively studied in recent years. A wide and increasing range of database applications has to deal with spatial objects whose position changes continuously over time, called moving objects. The main interest of these applications is to efficiently store and query the positions of these continuously moving objects. To achieve this goal, index structur...

متن کامل

Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...

متن کامل

Similarity Search In Multimedia Databases Habilitation

The M-tree is a dynamic data structure designed to index metric datasets. In this paper we introduce two dynamic techniques of building the M-tree. The first one incorporates a multi-way object insertion while the second one exploits the generalized slim-down algorithm. Usage of these techniques or even combination of them significantly increases the querying performance of the M-tree. We also ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1801.05055  شماره 

صفحات  -

تاریخ انتشار 2018